COLLISIONS: IDENTICAL PACKETS ARE RARE
- TEST FOR INDEPENDENCE OF SAMPLING DECISION & ADDRESSES
• C(T) < 1 - significance level => accept hypothesis
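
The acceptance rule above can be sketched in code. This is a minimal illustration, not the procedure used for the slides: it assumes a 2x2 contingency table (sampling decision vs. one address bit), a Pearson chi-square statistic T with 1 degree of freedom, and a significance level of 0.05. The function names and the table layout are hypothetical.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square statistic T for a 2x2 contingency table.

    Assumed layout: table[i][j] = number of packets with sampling
    decision i (0 = not sampled, 1 = sampled) and address-bit value j.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row0, row1 = a + b, c + d
    col0, col1 = a + c, b + d
    if 0 in (row0, row1, col0, col1):
        raise ValueError("every row and column needs at least one count")
    return n * (a * d - b * c) ** 2 / (row0 * row1 * col0 * col1)

def chi2_cdf_1dof(t):
    """CDF of the chi-squared distribution with 1 degree of freedom."""
    return math.erf(math.sqrt(t / 2.0))

def accept_independence(table, significance=0.05):
    """Accept the independence hypothesis when C(T) < 1 - significance."""
    return chi2_cdf_1dof(chi_square_2x2(table)) < 1.0 - significance

if __name__ == "__main__":
    balanced = [[50, 50], [50, 50]]   # decision unrelated to the bit
    skewed = [[90, 10], [10, 90]]     # decision tracks the bit
    print(accept_independence(balanced), accept_independence(skewed))
```

A balanced table gives T = 0 and is accepted, while a strongly skewed table gives a large T and is rejected.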
Optimal Sampling
• Fix amount of measurement traffic c per period

• Tradeoff: collisions vs. label size 

• Problem: 

- n: number of samples in sampling period 

- M: alphabet size, m = log2(M) [bits/label]

- n · m: total amount of measurement traffic [bits]

- Goal: maximize number of unique labels 

subject to n · m ≤ c.

- Optimal alphabet size: M* = c log(2) (natural logarithm)
- Optimal number of samples: n* = c / log2(M*)

Example: c = 10^6 bit => m* = 19.4 bit/label

n* = 5.15 * 10^4 samples
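
The example can be checked numerically. The sketch below assumes a budget of c = 10^6 bit (the value consistent with m* = 19.4 bit/label), a natural logarithm in M* = c log(2), and n* = c / log2(M*). The function name is hypothetical.

```python
import math

def optimal_sampling(c_bits):
    """Return (M*, m*, n*) for a measurement budget of c_bits per period.

    Assumes M* = c * ln(2), m* = log2(M*) and n* = c / m*.
    """
    alphabet_size = c_bits * math.log(2)      # M*
    bits_per_label = math.log2(alphabet_size) # m*
    num_samples = c_bits / bits_per_label     # n*
    return alphabet_size, bits_per_label, num_samples

if __name__ == "__main__":
    M_star, m_star, n_star = optimal_sampling(1e6)
    print(f"M* = {M_star:.0f}, m* = {m_star:.1f} bit/label, n* = {n_star:.3g} samples")
```

Under these assumptions the result rounds to m* = 19.4 bit/label and n* of about 5.15 * 10^4 samples, matching the slide.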




HASH-SAMPLED ADDRESS BITS DISTRIBUTIONS.

Quantile-quantile plot of address bit chi-square values vs. chi-squared distribution
with 1 degree of freedom; for various traces, primes A, thinning factors r/A: see
text. Close agreement for 40 byte packet prefixes; marked disagreement for 20 byte
packet prefixes (i.e. no payload included for sampling hash)
Expected number of unique samples and optimal n = n*




The expected number of unique samples as a function of n for c = 10^6 bit.
The optimal number of samples n* is approximately 5.15 * 10^4, with m* = 19.4
bit per label. The collision probability P_coll is approximately 0.072, i.e., 7.2% of
the samples transmitted to the collection system have to be discarded.
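
The collision figure in the caption can be reproduced with a simple model. The sketch below assumes labels are drawn independently and uniformly from an alphabet of size M = 2^m, and takes a sample to collide when at least one of the other n - 1 samples carries the same label. This model and both function names are assumptions made for illustration, not taken from the slides.

```python
def collision_probability(n, M):
    """Probability that a given sample shares its label with another sample.

    Assumes n samples with independent, uniformly distributed labels
    from an alphabet of size M.
    """
    return 1.0 - (1.0 - 1.0 / M) ** (n - 1)

def expected_unique_samples(n, M):
    """Expected number of samples whose label occurs exactly once."""
    return n * (1.0 - 1.0 / M) ** (n - 1)

if __name__ == "__main__":
    n_star = 5.15e4          # samples per period, from the caption
    M_star = 2 ** 19.4       # alphabet size for 19.4 bit per label
    print(f"P_coll = {collision_probability(n_star, M_star):.3f}")
    print(f"unique = {expected_unique_samples(n_star, M_star):.0f}")
```

With n* = 5.15 * 10^4 and m* = 19.4 bit per label, this model gives a collision probability close to the 0.072 stated in the caption.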
Inference Experiment
• Experiment: inference from trajectory samples 

- Estimate fraction of traffic from customer 

- Customer traffic: small source address subset 

• Fraction of customer traffic on backbone: μ
Estimator: μ̂ = n_{c,b} / n_b

n_{c,b}: # unique labels common on both links
n_b: # unique labels on backbone link

• Ingress link and source address correlated 
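
The estimator on this slide is a ratio: the number of unique labels seen on both the customer link and the backbone link, divided by the number of unique labels on the backbone link. A minimal sketch follows. It assumes that a "unique" label is one observed exactly once on a link during the sampling period, so colliding labels are discarded. The function names and the input format (one list of labels per link) are hypothetical.

```python
from collections import Counter

def unique_labels(labels):
    """Labels that occur exactly once (colliding labels are discarded)."""
    counts = Counter(labels)
    return {label for label, k in counts.items() if k == 1}

def estimate_customer_fraction(customer_labels, backbone_labels):
    """Estimate the fraction of backbone traffic that comes from the customer.

    Returns n_cb / n_b, where n_cb is the number of unique labels common
    to both links and n_b is the number of unique labels on the backbone.
    """
    unique_c = unique_labels(customer_labels)
    unique_b = unique_labels(backbone_labels)
    if not unique_b:
        raise ValueError("no unique labels observed on the backbone link")
    return len(unique_c & unique_b) / len(unique_b)

if __name__ == "__main__":
    customer = [1, 2, 3, 3]               # label 3 collides on this link
    backbone = [1, 2, 4, 5, 6, 6, 7, 8]   # label 6 collides on this link
    print(estimate_customer_fraction(customer, backbone))
```

In the toy data, two of the six unique backbone labels also appear as unique labels on the customer link, so the estimate is one third.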
Estimated Customer Traffic (c = 10^ [bits/epoch])
Estimated Customer Traffic (c = 10^ [bits/epoch])